{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "a870e71d",
   "metadata": {},
   "source": [
    "# Python 文件与模块"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1257bf93",
   "metadata": {},
   "source": [
    "## 1.4.1 文件"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "77a65ea6",
   "metadata": {},
   "source": [
    "经过几百万年的演化，人类进化出了其他动物没有的“长期记忆”，进而拥有了语言、文化甚至发展出文明，走上了一条未曾设想的道路。在人类大脑中，“短期记忆”由海马体负责，“长期记忆”则由大脑皮层中的额叶负责；在计算机中也同样存在长短期记忆，短期记忆由程序中的变量负责，长期记忆则交给电脑中的文件系统。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8199ed32",
   "metadata": {},
   "source": [
    "### 1.4.1.1 Open 函数"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6638d46b",
   "metadata": {},
   "source": [
    "Python `open( )` 函数用于打开一个文件，并返回文件对象，在对文件进行处理过程都需要使用到这个函数。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "508c28c5",
   "metadata": {},
   "source": [
    "> 为了测试 python 中的文件操作，我们先通过命令行在当前目录保存一个内容为 “hello world!” 的 test.txt 文件。\n",
    "\n",
    "<img style=\"float: center;\" src=\"./src/fig41.png\" width=\"50%\"> "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fd7d8214",
   "metadata": {},
   "source": [
    "`open(file, mode)` 函数主要有 file 和 mode 两个参数，其中 file 为需要读写文件的路径。mode 为读取文件时的模式，常用的模式有以下几个：\n",
    "\n",
    "- r：以字符串的形式读取文件。\n",
    "- rb：以二进制的形式读取文件。\n",
    "- w：写入文件。\n",
    "- a：追加写入文件。\n",
    "\n",
    "不同模式下返回的文件对象功能也会不同。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "85581df5",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class '_io.TextIOWrapper'>\n"
     ]
    }
   ],
   "source": [
    "file = open('test.txt', 'r')\n",
    "print(type(file))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d4297989",
   "metadata": {},
   "source": [
    "### 1.4.1.2 文件对象"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b2665bee",
   "metadata": {},
   "source": [
    "`open` 函数会返回一个 文件对象。在进行文件操作前，我们首先需要了解文件对象提供了哪些常用的方法：\n",
    "\n",
    "- `close( )`: 关闭文件\n",
    "- 在 r 与 rb 模式下：\n",
    "    - `read()`: 读取整个文件\n",
    "    - `readline()`: 读取文件的一行\n",
    "    - `reallines()`: 读取文件的所有行\n",
    "- 在 w 与 a 模式下：\n",
    "    - `write()`: \n",
    "    - `weitelines()`: "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0660d622",
   "metadata": {},
   "source": [
    "下面我们通过实例学习这几种方法："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "c42d7de4",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Hello world!\n",
      "Hello Python!!\n",
      "Hello smart way!!!\n"
     ]
    }
   ],
   "source": [
    "## 通过 read 方法读取整个文件\n",
    "content = file.read()\n",
    "print(content)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "0604c0d2",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "## 通过 readline() 读取文件的一行\n",
    "content = file.readline()\n",
    "print(content)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "921fae61",
   "metadata": {},
   "source": [
    "我们发现上面的代码竟然什么也没输出，这是为什么？\n",
    "\n",
    "你可以理解 open 函数返回的是一个指针，类似于你在 Microsolf Word 文档里编辑文档时闪烁的光标。在执行过 `file.read( )` 操作后，由于读取了整个文件，这个指针已经来到了文件末尾，因此 `file.readline( )` 没有获取到文件的内容。这种情况我们可以通过 close 方法关闭文件后再重新打开。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "7d180532",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Hello world!\n",
      "\n"
     ]
    }
   ],
   "source": [
    "## 关闭之前打开的 test.txt 文件\n",
    "file.close()\n",
    "## 重新打开\n",
    "file = open('test.txt', 'r')\n",
    "content = file.readline()\n",
    "print(content)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1ebaed11",
   "metadata": {},
   "source": [
    "因此在操作文件时，我们一定要注意每次操作结束后，及时通过 `close( )` 方法关闭文件。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "8388fe2d",
   "metadata": {},
   "outputs": [],
   "source": [
    "## 以 w 模式打开文件test.txt\n",
    "file = open('test.txt', 'w')\n",
    "## 创建需要写入的字符串变量 在字符串中 \\n 代表换行（也就是回车）\n",
    "content = 'Hello world!\\nHello Python!!\\n'\n",
    "## 写入到 test.txt 文件中\n",
    "file.write(content)\n",
    "## 关闭文件对象\n",
    "file.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a4726843",
   "metadata": {},
   "source": [
    "<img style=\"float: center;\" src=\"./src/fig42.png\" width=\"50%\"> "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4db4bd05",
   "metadata": {},
   "source": [
    "需要注意在上述操作中，w 模式会覆盖之前的文件。如果你想在文件后面追加内容，可以使用 a 模式操作。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "f3352cfd",
   "metadata": {},
   "outputs": [],
   "source": [
    "## 以 w 模式打开文件test.txt\n",
    "file = open('test.txt', 'a')\n",
    "## 创建需要追加的字符串变量\n",
    "content = 'Hello smart way!!!'\n",
    "## 写入到 test.txt 文件中\n",
    "file.write(content)\n",
    "## 关闭文件对象\n",
    "file.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "01915069",
   "metadata": {},
   "source": [
    "<img style=\"float: center;\" src=\"./src/fig43.png\" width=\"50%\"> "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4d487fad",
   "metadata": {},
   "source": [
    "## 1.4.2 模块"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a8bab259",
   "metadata": {},
   "source": [
    "在普罗米修斯为人类带来火种后，人类制造出了各种各样的工具。工具的产生大大提高了人类的生产力，使人类从石器时代步入铁器时代、蒸汽时代以及现在的信息时代。\n",
    "\n",
    "在之前的学习中，为了更简洁高效地完成任务，我们引入了函数、数据结构以及面向对象编程等工具。但每次我们在这个代码写完后，在下一个代码中只能把代码复制粘贴过去，而模块可以帮助我们把一个代码添加到另一个代码中，真正实现了工具等复用性。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ac5aa0a2",
   "metadata": {},
   "source": [
    "编写模块的方式有很多，其中最简单的模块就是创建一个包含很多函数、变量以及类并以 .py 为后缀的文件。下面我们把上一节中实现的 class 类保存在 student.py 文件中："
   ]
  },
  {
   "cell_type": "markdown",
   "id": "46b45d90",
   "metadata": {},
   "source": [
    "<img style=\"float: center;\" src=\"./src/fig44.png\" width=\"50%\"> "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ef0e6af7",
   "metadata": {},
   "source": [
    "使用 `import` 关键字可以把一个模块引入到一个程序来使用它的功能。记得在上一节中我们说过一个程序也可以是一个对象，这时 student.py 程序就成了一个对象，而里面的 student 类便成了它的对象变量。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "8de02c29",
   "metadata": {},
   "outputs": [],
   "source": [
    "import student"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "4c9f7a60",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "('XiaoHu', 65, 55)\n"
     ]
    }
   ],
   "source": [
    "xiaohu = student.student('XiaoHu', 65, 55)\n",
    "print(xiaohu)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2b91e95f",
   "metadata": {},
   "source": [
    "通过 student.py 这个模块，我们不需要重复编写就可以在任何程序中使用 student 类！正是因为有了模块，才使得数据科学中复杂的算法与模型可以封装到模块中，用户只需要使用模块中定义好的函数与类。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c422e8ea",
   "metadata": {},
   "source": [
    "当我们只需要模块中的几个函数或类时，也可以采用 `from model_name import xxx` 的写法导入指定部分："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "abf9e578",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "('XiaoHu', 65, 55)\n"
     ]
    }
   ],
   "source": [
    "## 仅导入 student 类\n",
    "from student import student\n",
    "## 这时直接通过类名，不需要使用模块名\n",
    "xiaohu = student('XiaoHu', 65, 55)\n",
    "print(xiaohu)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3024271d",
   "metadata": {},
   "source": [
    "在 Python 中内置了很多标准模块，例如用于数学操作的 math 模块与处理时间的 time 模块："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "323741ef",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "4.174387269895637\n"
     ]
    }
   ],
   "source": [
    "import math\n",
    "## 通过 math.log 计算数值的对数\n",
    "print(math.log(xiaohu.math_score))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "67f56bad",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Thu Jul  1 09:51:16 2021\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "## 通过 time 库中多个方法获取当前时间\n",
    "## time.time 获取当前时间的 unix 时间戳\n",
    "## time.loacltime 把当前时间的 unix 时间戳按照时区划分\n",
    "## time.asctime 把按时区划分的时间戳转化成标准日期格式\n",
    "print(time.asctime(time.localtime(time.time())))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "389616ba",
   "metadata": {},
   "source": [
    "除此之外，Python 还有很多标准库值得我们去探索。例如 re 库可以实现字符串正则表达式的匹配，random 用来生成随机数，urllib 用来访问互联网……"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "77869207",
   "metadata": {},
   "source": [
    "## 1.4.3 练习"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "709140e5",
   "metadata": {},
   "source": [
    "### 1.4.3.1 学生信息本地存储"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2d0897a3",
   "metadata": {},
   "source": [
    "> 之前我们实现了学生信息的“增删查改”功能，现在希望通过文件系统保存学生的信息，每次打开程序时可以载入，关闭程序时可以保存。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f0673d08",
   "metadata": {},
   "source": [
    "### 1.4.3.2 Python 之蝉"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7d95cc3f",
   "metadata": {},
   "source": [
    "<blockquote>\n",
    "\n",
    "首先在这里恭喜各位同学完成了 Python 的全部学习内容！在这四章中希望你可以体会到本教程简洁明快的风格，这种风格与 Python 本身的代码设计风格是一致的。在 Tim Peters 编写的 “Python 之禅” 中的核心指导原则就是 “Simple is better than complex”。\n",
    "\n",
    "纵观历史，你会看到苹果创始人史蒂夫·乔布斯对 Less is More 的追求，看到无印良品“删繁就简，去其浮华”的核心设计理念，看到山下英子在《断舍离》中对生活做减法的观点，甚至看到苏东坡“竹杖芒鞋轻胜马，一蓑烟雨任平生”的人生态度。你会发现极简主义不只存在于 Python 编程中，它本就是这个世界优雅的一条运行法则。\n",
    "    \n",
    "本练习让我们导入 this 模块，用心体会一下 Python 的设计原则。\n",
    " \n",
    "</blockquote>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "d918ee0e",
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The Zen of Python, by Tim Peters\n",
      "\n",
      "Beautiful is better than ugly.\n",
      "Explicit is better than implicit.\n",
      "Simple is better than complex.\n",
      "Complex is better than complicated.\n",
      "Flat is better than nested.\n",
      "Sparse is better than dense.\n",
      "Readability counts.\n",
      "Special cases aren't special enough to break the rules.\n",
      "Although practicality beats purity.\n",
      "Errors should never pass silently.\n",
      "Unless explicitly silenced.\n",
      "In the face of ambiguity, refuse the temptation to guess.\n",
      "There should be one-- and preferably only one --obvious way to do it.\n",
      "Although that way may not be obvious at first unless you're Dutch.\n",
      "Now is better than never.\n",
      "Although never is often better than *right* now.\n",
      "If the implementation is hard to explain, it's a bad idea.\n",
      "If the implementation is easy to explain, it may be a good idea.\n",
      "Namespaces are one honking great idea -- let's do more of those!\n"
     ]
    }
   ],
   "source": [
    "import this"
   ]
  }
 ],
 "metadata": {
  "interpreter": {
   "hash": "aee8b7b246df8f9039afb4144a1f6fd8d2ca17a180786b69acc140d282b71a49"
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.2"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
